General $H$-theorem and entropies that violate the second law

We study systems with a finite number of states $A_i$ that obey first-order kinetics (the master
equation). A general criterion is found for the existence of an $H$-theorem with a given $H$. A convex
function $H$ is a Lyapunov function for all master equations with the given equilibrium if and only
if its conditional minima properly describe the equilibria of the pair transitions $A_i \rightleftharpoons A_j$. This theorem
does not depend on the principle of detailed balance and is valid both for reversible and for general
Markov kinetics. Analysis of pair equilibria demonstrates, for example, that popular Bregman
divergences such as the Euclidean distance or the Itakura-Saito divergence in the space of distributions
cannot be universal Lyapunov functions for first-order kinetics: they increase in some Markov
processes.



The first non-classical entropy was proposed by Rényi
in 1960 [1]. In the same paper, he discovered a very
general class of divergences, the so-called $f$-divergences
(or Csiszár-Morimoto divergences, after the works
[2, 3] published simultaneously in 1963):

$$H_h(P\|P^*) = \sum_i p_i^* \, h\!\left(\frac{p_i}{p_i^*}\right), \tag{1}$$

where $P = (p_i)$ is a probability distribution, $P^*$ is an
equilibrium distribution, and $h(x)$ is a convex function defined
on the open ($x > 0$) or closed ($x \ge 0$) semi-axis. We
use the notation $H_h(P\|P^*)$ to stress the dependence
of $H_h$ on both $p_i$ and $p_i^*$.

These divergences have the form of the relative entropy
or, in thermodynamic terminology, the (negative)
free entropy, the Massieu-Planck functions [4], or
$F/RT$, where $F$ is the free energy. They measure the deviation
of the current distribution from the equilibrium.
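
To make definition (1) concrete, the following minimal Python sketch evaluates an $f$-divergence for a chosen convex $h$. The helper names `f_divergence` and `h_kl` and the sample distributions are illustrative assumptions, not taken from the paper. With $h(x) = x \ln x$ the sum should reduce to the classical relative entropy $\sum_i p_i \ln(p_i/p_i^*)$.

```python
import math

def f_divergence(p, p_eq, h):
    """H_h(P||P*) = sum_i p*_i * h(p_i / p*_i), for a positive equilibrium p_eq."""
    return sum(q * h(x / q) for x, q in zip(p, p_eq))

def h_kl(x):
    # h(x) = x ln x (with the convention 0 ln 0 = 0) gives the relative entropy.
    return x * math.log(x) if x > 0 else 0.0

# Hypothetical three-state example (values chosen only for illustration).
p_eq = [0.5, 0.3, 0.2]
p = [0.2, 0.2, 0.6]

kl_from_h = f_divergence(p, p_eq, h_kl)
kl_direct = sum(x * math.log(x / q) for x, q in zip(p, p_eq))
```

Both values agree up to rounding, and the divergence vanishes at $P = P^*$ because $h(1) = 0$ for this choice of $h$.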

After 1961, many new entropies and divergences were
invented and applied to real problems, including the Burg
entropy [5], the Cressie-Read family of power divergences [6],
the Tsallis entropy [7, 8], families of $\alpha$-, $\beta$- and $\gamma$-divergences
[9] and many others (see the review papers [10, 11]).
Many of them have the $f$-divergence form, but some
do not. For example, the squared Euclidean distance
from $P$ to $P^*$ is not, in general, an $f$-divergence
unless all $p_i^*$ are equal (equidistribution). Another example
is the Itakura-Saito divergence:

$$\sum_i \left( \frac{p_i}{p_i^*} - \ln\frac{p_i}{p_i^*} - 1 \right). \tag{2}$$

The idea of Bregman divergences [12] provides a new general
source of divergences that may differ from the $f$-divergences.
Any strictly convex function $F$ in a closed
convex set $V$ satisfies the Jensen inequality

$$D_F(p, q) = F(p) - F(q) - (\nabla_q F(q), p - q) > 0$$

if $p \ne q$, $p, q \in V$. The positive quantity $D_F(p, q)$ is
the Bregman divergence associated with $F$. For example,
for a positive quadratic form $F(x)$ the Bregman divergence
is just $D_F(p, q) = F(p - q)$. In particular, if
$F$ is the squared Euclidean length of $x$ then $D_F(p, q)$ is
the squared Euclidean distance. If $F$ is the Burg entropy,
$F(x) = -\sum_i \ln x_i$, then $D_F(p, q)$ is the Itakura-Saito
divergence. The Bregman divergences have many
attractive properties. For example, the mean vector minimizes
the expected Bregman divergence from the random
vector [13]. The Bregman divergences are convenient
for numerical optimization because of the generalized
Pythagorean identity [14].
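
The two special cases just mentioned can be checked numerically. The sketch below is an illustration under stated assumptions: the function names (`bregman`, `sq_norm`, `burg`, `itakura_saito`) and the test vectors are hypothetical, and the gradients are supplied by hand rather than derived automatically.

```python
import math

def bregman(F, grad_F, p, q):
    """D_F(p, q) = F(p) - F(q) - <grad F(q), p - q>."""
    inner = sum(g * (a - b) for g, a, b in zip(grad_F(q), p, q))
    return F(p) - F(q) - inner

def sq_norm(x):
    return sum(v * v for v in x)

def grad_sq_norm(x):
    return [2.0 * v for v in x]

def burg(x):
    # Burg entropy F(x) = -sum_i ln x_i
    return -sum(math.log(v) for v in x)

def grad_burg(x):
    return [-1.0 / v for v in x]

def itakura_saito(p, q):
    return sum(a / b - math.log(a / b) - 1.0 for a, b in zip(p, q))

# Hypothetical positive distributions used only for the comparison.
p = [0.2, 0.5, 0.3]
q = [0.4, 0.4, 0.2]

d_euclid = bregman(sq_norm, grad_sq_norm, p, q)
d_burg = bregman(burg, grad_burg, p, q)
```

Here `d_euclid` should coincide with the squared Euclidean distance between `p` and `q`, and `d_burg` with `itakura_saito(p, q)`, up to rounding.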

For information processing and for many physical applications
one more property is crucially important: the
divergence between the current distribution and the equilibrium
should monotonically decrease in Markov processes.
This is the ultimate requirement for the use of a divergence
in information processing and in non-equilibrium
thermodynamics and kinetics. In physics, the first result
of this type was Boltzmann's $H$-theorem, proven for
the nonlinear kinetic equation. In information theory, Shannon
[15] proved this theorem for the entropy ("the data
processing lemma") and Markov chains. The $H$-theorem
corresponds to the second law of thermodynamics. The
data processing lemma reflects a general principle:
information does not increase in random manipulations.

In his well-known paper [1], Rényi also proved that
$H_h(P\|P^*)$ monotonically decreases in Markov processes
(he gave a detailed proof for the classical relative entropy
and then mentioned that for the $f$-divergences the proof
is the same). This result, elaborated further by Csiszár
[2] and Morimoto [3], embraces many later particular $H$-theorems
for various entropies, including the Tsallis entropy
and the Rényi entropy (because the latter can be transformed
into the form (1) by a monotonic function, see for
example [11]). The generalized data processing lemma
was proven in [16, 17]: for every two positive probability
distributions $P$, $Q$ the divergence $H_h(P\|Q)$ decreases under
the action of a stochastic matrix $A = (a_{ij})$,

$$H_h(AP\|AQ) \le \alpha(A)\, H_h(P\|Q),$$

where

$$\alpha(A) = \frac{1}{2} \max_{i,k} \sum_j |a_{ij} - a_{kj}|$$

is the ergodicity contraction coefficient, $0 \le \alpha(A) \le 1$.
Here, neither $Q$ nor $P$ has to be the equilibrium distribution:
the divergence between any two distributions decreases
in Markov processes.
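
The contraction inequality can be tried on a small example. This is only a numerical sanity check, not a proof. It assumes the relative entropy as the divergence, a hypothetical $3 \times 3$ matrix, and a specific index convention: the matrix acts on column vectors and each of its columns sums to one, so the coefficient is computed from the $L_1$ distances between columns. The names `kl`, `apply` and `ergodicity_coefficient` are illustrative.

```python
import math

def kl(p, q):
    """Relative entropy, the f-divergence with h(x) = x ln x."""
    return sum(a * math.log(a / b) for a, b in zip(p, q) if a > 0)

def apply(A, p):
    """(A p)_i = sum_j A[i][j] * p[j]; each column of A sums to 1 (assumed convention)."""
    return [sum(row[j] * p[j] for j in range(len(p))) for row in A]

def ergodicity_coefficient(A):
    """Half the largest L1 distance between two columns of A (assumed convention)."""
    n = len(A)
    return 0.5 * max(
        sum(abs(A[i][j] - A[i][k]) for i in range(n))
        for j in range(n) for k in range(n)
    )

# Hypothetical column-stochastic matrix and two positive distributions.
A = [[0.7, 0.1, 0.2],
     [0.2, 0.8, 0.3],
     [0.1, 0.1, 0.5]]
P = [0.6, 0.3, 0.1]
Q = [0.2, 0.3, 0.5]

alpha = ergodicity_coefficient(A)
lhs = kl(apply(A, P), apply(A, Q))   # divergence after one step
rhs = alpha * kl(P, Q)               # contracted initial divergence
```

For this example `lhs` does not exceed `rhs`, in agreement with the lemma.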

Under some additional conditions, the property of
decreasing in Markov processes characterizes the $f$-divergences
[11, 18]. For example, if a divergence decreases
in all Markov processes, does not change under
permutation of states and can be represented as a sum
over states (has the trace form), then it is an $f$-divergence
[11, 18].

The dynamics of distributions in continuous-time
Markov processes is described by the master equation.
Thus, the $f$-divergences are Lyapunov functions for
the master equation. An important property of the divergences
$H_h(P\|P^*)$ is that they are universal Lyapunov
functions. That is, they depend on the current
distribution $P$ and on the equilibrium $P^*$ but do not depend
on the transition probabilities directly. Of course,
without additional conditions like the trace form, the
class of universal Lyapunov functions for the master
equations is much wider. For each new divergence
we have to analyze its behavior in Markov processes and
prove or refute the $H$-theorem. For this purpose, we
need a general and constructive criterion. It is desirable
to avoid any additional requirements like the trace form
or symmetry. In this paper we develop such a criterion.

Obviously, the equilibrium $P^*$ is a global minimum of
any Lyapunov function $H(P)$ in the simplex of distributions.
In brief, the general $H$-theorem states that a
convex function $H(P)$ is a universal Lyapunov function
for the master equation if and only if its conditional minima
correctly describe the partial equilibria for pairs of
transitions $A_i \rightleftharpoons A_j$. These partial equilibria are given
by the proportions $p_i/p_i^* = p_j/p_j^*$. They should be solutions
to the problem

$$H(P) \to \min \ \text{ subject to } \ p_k \ge 0 \ (k = 1, \ldots, n), \ \sum_{k=1}^n p_k = 1, \ \text{ and given values of } p_l \ (l \ne i, j). \tag{3}$$

We prove this general $H$-theorem and then analyze
several Bregman divergences that are not $f$-divergences
and demonstrate that they do not admit an
$H$-theorem even for systems with three states.

Three forms of master equation and the decomposition
theorem. We consider continuous-time Markov chains
with $n$ states $A_1, \ldots, A_n$. The master equation for the
probability distribution $P = (p_i)$ is

$$\frac{dp_i}{dt} = \sum_{j,\, j \ne i} (q_{ij} p_j - q_{ji} p_i), \quad i = 1, \ldots, n, \tag{4}$$

where the $q_{ij}$ ($i, j = 1, \ldots, n$, $i \ne j$) are non-negative. In
this notation, $q_{ij}$ is the rate constant for the transition
$A_j \to A_i$. Any set of non-negative coefficients $q_{ij}$ ($i \ne j$)
corresponds to a master equation. Therefore, the class
of master equations can be represented as the non-negative
orthant in $\mathbb{R}^{n(n-1)}$ with coordinates $q_{ij}$ ($i \ne j$).
Equations of the same class describe any first-order
kinetics in perfect mixtures.

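Now, let us restrict our consideration to the set of
Markov chains with a given positive equilibrium distribution
$P^*$ ($p_i^* > 0$). The rate constants must then satisfy the balance condition

$$\sum_{j,\, j \ne i} q_{ij} p_j^* = \Big( \sum_{j,\, j \ne i} q_{ji} \Big) p_i^* \quad \text{for all } i = 1, \ldots, n. \tag{5}$$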
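
As an illustration of (4) and (5), the sketch below evaluates the right-hand side of the master equation for a hypothetical irreversible cycle and checks that the uniform distribution satisfies the balance condition. The function names and the example rates are assumptions made for this sketch; indices are 0-based, and `q[i][j]` stands for the rate constant of $A_j \to A_i$.

```python
def master_rhs(q, p):
    """Right-hand side of (4): dp_i/dt = sum_{j != i} (q[i][j] p_j - q[j][i] p_i)."""
    n = len(p)
    return [
        sum(q[i][j] * p[j] - q[j][i] * p[i] for j in range(n) if j != i)
        for i in range(n)
    ]

def satisfies_balance(q, p_eq, tol=1e-12):
    """Balance condition (5): the right-hand side of (4) vanishes at p_eq."""
    return all(abs(v) <= tol for v in master_rhs(q, p_eq))

# Hypothetical irreversible cycle A1 -> A2 -> A3 -> A1 with unit rate constants.
q_cycle = [[0.0, 0.0, 1.0],
           [1.0, 0.0, 0.0],
           [0.0, 1.0, 0.0]]
uniform = [1.0 / 3.0] * 3

balanced_at_uniform = satisfies_balance(q_cycle, uniform)
total_flow = sum(master_rhs(q_cycle, [0.2, 0.5, 0.3]))  # probability is conserved
```

The cycle has no reverse transitions, yet the uniform distribution is its equilibrium, which shows that condition (5) is weaker than any pairwise balance of transitions.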



We join the transitions $A_i \rightleftharpoons A_j$ in pairs (say, $i > j$)
and introduce the stoichiometric vectors $\gamma^{ij}$ with coordinates

$$\gamma^{ij}_k = \begin{cases} -1 & \text{if } k = j, \\ 1 & \text{if } k = i, \\ 0 & \text{otherwise.} \end{cases} \tag{6}$$

Let us rewrite the master equation (4) in the quasichemical
form:

$$\frac{dP}{dt} = \sum_{i > j} (w^+_{ij} - w^-_{ij}) \gamma^{ij}, \tag{7}$$

where $w^+_{ij} = q_{ij} p_j$ is the rate of the transition $A_j \to A_i$
and $w^-_{ij} = q_{ji} p_i$ is the rate of the reverse process
$A_j \leftarrow A_i$ ($i > j$).

Systems with detailed balance form an important class
of first-order kinetics. The detailed balance condition
reads: at equilibrium, $w^+_{ij} = w^-_{ij}$, i.e.

$$q_{ij} p_j^* = q_{ji} p_i^* \ (= w^*_{ij}), \quad i, j = 1, \ldots, n. \tag{8}$$

Here, $w^*_{ij}$ is the equilibrium flux from $A_i$ to $A_j$ and back.

For systems with detailed balance the quasichemical
form of the master equation is especially simple:

$$\frac{dP}{dt} = \sum_{i > j} w^*_{ij} \left( \frac{p_j}{p_j^*} - \frac{p_i}{p_i^*} \right) \gamma^{ij}. \tag{9}$$

It is important that any set of non-negative equilibrium
fluxes $w^*_{ij}$ ($i > j$) defines a system with detailed balance
(9) with a given positive equilibrium $P^*$. Therefore, the
set of all systems (9) for any given equilibrium may be
represented as the non-negative orthant in $\mathbb{R}^{n(n-1)/2}$ with
coordinates $w^*_{ij}$ ($i > j$).
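
The correspondence between equilibrium fluxes and rate constants can be sketched as follows. Starting from hypothetical fluxes $w^*_{ij}$ and a hypothetical equilibrium, the code builds rate constants obeying (8) and compares the right-hand side in the form (4) with the quasichemical form (9). All names and numbers are assumptions made for illustration; indices are 0-based.

```python
def rates_from_fluxes(w, p_eq):
    """Rate constants with detailed balance (8): q_ij p*_j = q_ji p*_i = w*_ij.

    w maps a pair (i, j) with i > j to a non-negative equilibrium flux.
    """
    n = len(p_eq)
    q = [[0.0] * n for _ in range(n)]
    for (i, j), flux in w.items():
        q[i][j] = flux / p_eq[j]   # transition A_j -> A_i
        q[j][i] = flux / p_eq[i]   # transition A_i -> A_j
    return q

def master_rhs(q, p):
    """Form (4): dp_i/dt = sum_{j != i} (q[i][j] p_j - q[j][i] p_i)."""
    n = len(p)
    return [
        sum(q[i][j] * p[j] - q[j][i] * p[i] for j in range(n) if j != i)
        for i in range(n)
    ]

def quasichemical_rhs(w, p_eq, p):
    """Form (9): dP/dt = sum_{i>j} w*_ij (p_j/p*_j - p_i/p*_i) gamma^{ij}."""
    dp = [0.0] * len(p)
    for (i, j), flux in w.items():
        rate = flux * (p[j] / p_eq[j] - p[i] / p_eq[i])
        dp[i] += rate   # gamma^{ij}_i = +1
        dp[j] -= rate   # gamma^{ij}_j = -1
    return dp

# Hypothetical equilibrium, fluxes and current distribution.
p_eq = [0.5, 0.3, 0.2]
w = {(1, 0): 0.4, (2, 0): 0.1, (2, 1): 0.25}
p = [0.2, 0.2, 0.6]

q = rates_from_fluxes(w, p_eq)
rhs_4 = master_rhs(q, p)
rhs_9 = quasichemical_rhs(w, p_eq, p)
```

The two right-hand sides agree up to rounding, and both vanish at `p_eq`.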

The decomposition theorem [20, 21] states that for any
given positive equilibrium $P^*$ and any positive distribution
$P$ the set of possible values of $dP/dt$ for equations (7)
under the balance condition (5) coincides with the set
of possible values of $dP/dt$ for equation (9) under the
detailed balance condition (8). In other words, for every
general system (7) with positive equilibrium $P^*$ and any
given non-equilibrium distribution $P$ there exists a system
with detailed balance (9) with the same equilibrium
and the same value of the velocity vector $dP/dt$ at the point
$P$. Therefore, the sets of universal Lyapunov functions
for the general master equations and for the master
equations with detailed balance coincide.

General $H$-theorem. Let $H(P)$ be a convex function
on the space of distributions. It is a Lyapunov function
for a master equation with the positive equilibrium $P^*$ if
$dH(P(t))/dt \le 0$ for any positive solution $P(t)$. For a
system with detailed balance (9),

$$\frac{dH(P(t))}{dt} = -\sum_{i > j} w^*_{ij} \left( \frac{p_i}{p_i^*} - \frac{p_j}{p_j^*} \right) \left( \frac{\partial H(P)}{\partial p_i} - \frac{\partial H(P)}{\partial p_j} \right). \tag{10}$$

The inequality $dH(P(t))/dt \le 0$ holds for all non-negative
values of $w^*_{ij}$ if and only if it holds for each term
in (10) separately. That is, for any pair $i, j$ ($i > j$) the
convex function $H(P)$ is a Lyapunov function for the system
(9) in which only one $w^*_{ij}$ is nonzero.
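
The term-by-term structure of (10) can be inspected numerically. The sketch below, with hypothetical fluxes, equilibrium and test points, evaluates each term for the classical relative entropy $H(P) = \sum_i p_i \ln(p_i/p_i^*)$, whose partial derivatives are $\ln(p_i/p_i^*) + 1$. The function names are assumptions made for this illustration.

```python
import math

def dH_dt_terms(w, p_eq, p, grad_H):
    """Terms of (10): -w*_ij (p_i/p*_i - p_j/p*_j) (dH/dp_i - dH/dp_j), i > j."""
    g = grad_H(p)
    return {
        (i, j): -flux * (p[i] / p_eq[i] - p[j] / p_eq[j]) * (g[i] - g[j])
        for (i, j), flux in w.items()
    }

p_eq = [0.5, 0.3, 0.2]   # hypothetical equilibrium

def grad_relative_entropy(p):
    # H(P) = sum_i p_i ln(p_i / p*_i), so dH/dp_i = ln(p_i / p*_i) + 1.
    return [math.log(x / q) + 1.0 for x, q in zip(p, p_eq)]

# Hypothetical equilibrium fluxes (0-based pairs with i > j) and test points.
w = {(1, 0): 0.4, (2, 0): 0.1, (2, 1): 0.25}
test_points = [[0.2, 0.2, 0.6], [0.33, 0.17, 0.5], [0.7, 0.2, 0.1]]

all_terms = [dH_dt_terms(w, p_eq, p, grad_relative_entropy) for p in test_points]
totals = [sum(terms.values()) for terms in all_terms]
```

Every term is non-positive at these points because $(x - y)(\ln x - \ln y) \ge 0$ for positive $x$, $y$. This is the pairwise property that the criterion requires of a universal Lyapunov function.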

A convex function on a straight line is a Lyapunov
function for a 1D system with a single equilibrium if and
only if the equilibrium is a minimizer of this function.
This elementary fact together with the previous observation
gives us the criterion for universal Lyapunov functions
for systems with detailed balance. Let us introduce
the partial equilibria criterion:

Definition 1 A convex function $H(P)$ on the simplex
of probability distributions satisfies the partial equilibria
criterion with a positive equilibrium $P^*$ if the proportion
$p_i/p_i^* = p_j/p_j^*$ gives the minimizers in the problem (3).

Remark 1 The partial equilibria criterion means that
the partial equilibrium proportions $p_i/p_i^* = p_j/p_j^*$ give
minimizers in the problem (3), but it does not imply
that these proportions give all the minimizers. If $H(P)$ is
a convex but not strictly convex function, then the minimizers
for the given values of $p_l$ ($l \ne i, j$) may form a
segment that contains the partial equilibrium. Such a
segment is a face of the level set of $H(P)$ in the plane
with the given values of $p_l$ ($l \ne i, j$). Good examples are
the divergences

$$H_\infty(P\|P^*) = \max_i \left\{ \frac{p_i}{p_i^*} \right\} - 1 \quad \text{and} \quad H_{-\infty}(P\|P^*) = \max_i \left\{ \frac{p_i^*}{p_i} \right\} - 1$$

and their convex combinations [11].

Proposition 1 A convex function $H(P)$ on the simplex
of probability distributions is a Lyapunov function for
all master equations with the given equilibrium $P^*$ that
obey the principle of detailed balance if and only if it satisfies
the partial equilibria criterion with this equilibrium.

Combining this Proposition with the decomposition
theorem [20] gives the same criterion for general master
equations, without the hypothesis of detailed balance.

Proposition 2 A convex function $H(P)$ on the simplex
of probability distributions is a Lyapunov function for all
master equations with the given equilibrium $P^*$ if and
only if it satisfies the partial equilibria criterion with this
equilibrium.

These two Propositions together form the general $H$-theorem.

Theorem 1 The partial equilibria criterion with a positive
equilibrium $P^*$ is a necessary condition for a convex
function to be a universal Lyapunov function for
all master equations with detailed balance and equilibrium
$P^*$, and a sufficient condition for this function to
be a universal Lyapunov function for all master equations
with equilibrium $P^*$.

Let us stress that here the partial equilibria criterion provides
a necessary condition for systems with detailed balance
(and, therefore, it is necessary for more general systems
without the detailed balance assumption) and a sufficient
condition for the general systems (and, therefore,
for the narrower class of systems with detailed balance).

Examples. The simplest Bregman divergence is the
squared Euclidean distance between $P$ and $P^*$, $\sum_i (p_i - p_i^*)^2$.
The solution to the problem (3) is $p_i - p_i^* = p_j - p_j^*$.
Obviously, it differs from the proportion required by the
partial equilibria criterion (Fig. 1a). For the Itakura-Saito
divergence (2) the solution to the problem (3) is
$\frac{1}{p_i} - \frac{1}{p_i^*} = \frac{1}{p_j} - \frac{1}{p_j^*}$. It also differs from the proportion
required (Fig. 1b).
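
The partial equilibria criterion can be tested numerically for a given convex function. The sketch below is one possible check, under stated assumptions: it uses a ternary search on the segment $p_i + p_j = \text{const}$ (valid because $H$ is convex there), a hypothetical equilibrium and starting point, and illustrative function names. It compares the conditional minimizers of the relative entropy, the squared Euclidean distance and the Itakura-Saito divergence with the partial equilibrium $p_i/p_i^* = p_j/p_j^*$.

```python
import math

def minimize_on_pair(H, p, i, j, iters=200):
    """Ternary search for the minimizer of a convex H in problem (3).

    All p_l with l != i, j are fixed, so p_i + p_j = s is constant and the
    search runs over p_i in (0, s). Returns the minimizing value of p_i.
    """
    s = p[i] + p[j]
    lo, hi = 1e-12 * s, (1.0 - 1e-12) * s

    def value(pi):
        trial = list(p)
        trial[i], trial[j] = pi, s - pi
        return H(trial)

    for _ in range(iters):
        m1 = lo + (hi - lo) / 3.0
        m2 = hi - (hi - lo) / 3.0
        if value(m1) < value(m2):
            hi = m2
        else:
            lo = m1
    return 0.5 * (lo + hi)

p_eq = [0.5, 0.3, 0.2]   # hypothetical equilibrium

def relative_entropy(p):
    return sum(x * math.log(x / q) for x, q in zip(p, p_eq))

def squared_euclidean(p):
    return sum((x - q) ** 2 for x, q in zip(p, p_eq))

def itakura_saito(p):
    return sum(x / q - math.log(x / q) - 1.0 for x, q in zip(p, p_eq))

p = [0.25, 0.25, 0.5]    # hypothetical starting point; p[2] stays fixed
i, j = 0, 1
s = p[i] + p[j]
partial_equilibrium = s * p_eq[i] / (p_eq[i] + p_eq[j])   # p_i/p*_i = p_j/p*_j

argmin_kl = minimize_on_pair(relative_entropy, p, i, j)
argmin_euclid = minimize_on_pair(squared_euclidean, p, i, j)
argmin_is = minimize_on_pair(itakura_saito, p, i, j)
```

In this example only `argmin_kl` coincides with `partial_equilibrium`. The Euclidean minimizer satisfies $p_i - p_i^* = p_j - p_j^*$ and the Itakura-Saito minimizer satisfies the reciprocal condition stated above, so both miss the partial equilibrium.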

If the single equilibrium of a 1D system is not a minimizer
of a convex function $H$, then $dH/dt > 0$ on the
interval between the equilibrium and the minimizer of $H$ (or
the minimizers if it is not unique). Therefore, if $H(P)$ does
not satisfy the partial equilibria criterion, then in the simplex
of distributions there exists an area bordered by the
partial equilibria surface for $A_i \rightleftharpoons A_j$ and by the minimizers
for the problem (3), where $dH/dt > 0$ for some master equations
(Fig. 1). In particular, in such an area
$dH/dt > 0$ for the simple system with two transitions,
$A_i \rightleftharpoons A_j$, and the same equilibrium.
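
A point inside such an area is easy to exhibit. The sketch below assumes a hypothetical equilibrium, a single active pair of transitions with detailed balance, and a distribution chosen between the partial equilibrium line and the line of Euclidean conditional minima. It estimates $dH/dt$ by a finite difference along one explicit Euler step; the names and numbers are illustrative assumptions.

```python
import math

p_eq = [0.5, 0.3, 0.2]   # hypothetical equilibrium P*
w_10 = 1.0               # only the pair with 0-based indices 1 and 0 is active

def velocity(p):
    """dP/dt from (9) with the single nonzero equilibrium flux w_10."""
    rate = w_10 * (p[0] / p_eq[0] - p[1] / p_eq[1])
    return [-rate, rate, 0.0]

def squared_euclidean(p):
    return sum((x - q) ** 2 for x, q in zip(p, p_eq))

def relative_entropy(p):
    return sum(x * math.log(x / q) for x, q in zip(p, p_eq))

def rate_of_change(H, p, dt=1e-6):
    """Finite-difference estimate of dH/dt along one explicit Euler step."""
    v = velocity(p)
    p_next = [x + dt * dx for x, dx in zip(p, v)]
    return (H(p_next) - H(p)) / dt

# With p[0] + p[1] = 0.5 the partial equilibrium is at p[0] = 0.3125 and the
# Euclidean conditional minimum at p[0] = 0.35; p[0] = 0.33 lies in between.
p = [0.33, 0.17, 0.5]

dH_euclid = rate_of_change(squared_euclidean, p)
dH_kl = rate_of_change(relative_entropy, p)
```

At this point the squared Euclidean distance to $P^*$ grows in time while the relative entropy decreases, as the general $H$-theorem predicts.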

FIG. 1: The triangle of distributions for the system with three states $A_1$, $A_2$, $A_3$ and the equilibrium $P^* = (p_1^*, p_2^*, p_3^*)$.
The lines of partial equilibria $A_i \rightleftharpoons A_j$ given by the proportions $p_i/p_i^* = p_j/p_j^*$ are shown: for $A_1 \rightleftharpoons A_2$ by solid straight lines
(with one end at the vertex $A_3$), for $A_2 \rightleftharpoons A_3$ and for $A_1 \rightleftharpoons A_3$ by dashed lines. The lines of conditional minima of $H(P)$ in problem (3)
are presented for the partial equilibrium $A_1 \rightleftharpoons A_2$: (a) for the squared Euclidean distance this is a straight line given by the
equation $p_1 - p_1^* = p_2 - p_2^*$ (the circle is an example of a level set of $H(P)$), and (b) for the Itakura-Saito divergence this
is a curve given by the equation $\frac{1}{p_1} - \frac{1}{p_1^*} = \frac{1}{p_2} - \frac{1}{p_2^*}$. Between these lines and the line of partial equilibria the "No $H$-theorem
zone" is situated. In this zone, $H(P)$ increases in time for some master equations with equilibrium $P^*$. Similar zones (not
shown) exist near the other partial equilibrium lines too. Outside these zones, $H(P)$ monotonically decreases in time for any master
equation with equilibrium $P^*$.

Discussion. Many non-classical entropies have been invented
and applied to various problems in physics and
data analysis. The problem of the $H$-theorem appears in
many areas of research and is important for practical
applications. A good example is the discussion of the
$H$-theorem with classical and non-classical entropies
for the lattice Boltzmann methods [22, 23], a field of science
between kinetic theory and numerical analysis.

We suggest that if an entropy has no $H$-theorem (that
is, it violates the second law and the data processing
lemma) then there should be unprecedentedly strong reasons
for its use. Without such strong reasons we cannot
employ it. In this paper, the general criterion for the existence
of an $H$-theorem is proved. It has a simple and physically
transparent form: the convex divergence (relative
entropy) should properly describe the partial equilibria
for all pair transitions $A_i \rightleftharpoons A_j$. It is straightforward
to check this partial equilibria criterion. The applicability
of this criterion does not depend on the detailed
balance condition: it is valid both for the class of
systems with detailed balance and for general first-order
kinetics without this assumption. We describe the
universal Lyapunov functions for the master equations in
the form of a constructive criterion that allows us to
test any convex function. It is also possible to look for an alternative
description of the universal Lyapunov functions in
the form of a constructive parametric representation. We
can expect new promising classes of entropies from this
direction of research in the future.